A Unified Programmable Edge Matrix Processor for Deep Neural Networks and Matrix Algebra

Authors

Abstract

Matrix Algebra and Deep Neural Networks represent foundational classes of computational algorithms across multiple emerging applications, such as Augmented and Virtual Reality, autonomous navigation (cars, drones, robots), data science, and various artificial-intelligence-driven solutions. An accelerator-based architecture can provide performance and energy efficiency by supporting fixed functions through customized datapaths. However, constrained edge systems, which require diverse matrix operations to be efficiently supported, cannot afford numerous custom accelerators. In this article, we present MxCore, a unified architecture comprising tightly coupled, programmable vector cores that share highly optimized interconnects, along with a configurable hardware scheduler that manages their co-execution. We submit MxCore as a generalized approach to facilitate flexible acceleration of Deep-learning and matrix-algebra workloads across a range of sparsity levels. Unified compute resources improve overall resource utilization per unit area. Aggressive, novel microarchitecture techniques with block-level support optimize data-reuse and minimize bandwidth and power requirements, enabling ultra-low latency for low-power, cost-sensitive deployments. MxCore requires a small silicon footprint of 0.2068 mm² in a modern 7-nm process; at 1 GHz it achieves 0.15 (FP32) and 0.62 (INT8) TMAC/mm² while dissipating only 11.66 μW of leakage power. At iso-technology and iso-frequency, it provides 651.4×, 159.9×, 104.8×, and 124.2× speedups compared to a 128-core Nvidia Maxwell GPU on dense General Matrix Multiply, sparse Neural Network, Cholesky decomposition, and triangular solve, respectively.
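The four benchmark kernels the abstract cites (dense GEMM, a sparse network layer, Cholesky decomposition, and triangular solve) can be sketched in NumPy for reference. This is only an illustration of the workloads, not MxCore's implementation; all names and sizes below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Dense General Matrix Multiply (GEMM): C = A @ B
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = A @ B

# Sparse network layer: most weights zeroed out, then a matrix-vector product.
W = np.where(rng.random((n, n)) < 0.8, 0.0, rng.standard_normal((n, n)))
y = W @ rng.standard_normal(n)

# Cholesky decomposition of a symmetric positive-definite matrix: S = L @ L.T
S = A @ A.T + n * np.eye(n)   # shift the diagonal to guarantee positive definiteness
L = np.linalg.cholesky(S)

# Triangular solve by forward substitution: find x with L @ x = b
b = rng.standard_normal(n)
x = np.empty(n)
for i in range(n):
    x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]

assert np.allclose(L @ L.T, S)   # Cholesky factors reproduce S
assert np.allclose(L @ x, b)     # forward substitution solves the system
```

The forward-substitution loop makes the data-dependence pattern of a triangular solve explicit: each `x[i]` needs all earlier entries, which is why this kernel stresses an accelerator's scheduling differently than a fully parallel GEMM.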


Similar Resources

Analysis of Deep Neural Networks with Extended Data Jacobian Matrix

Deep neural networks have achieved great success on a variety of machine learning tasks. There are many fundamental and open questions yet to be answered, however. We introduce the Extended Data Jacobian Matrix (EDJM) as an architecture-independent tool to analyze neural networks at the manifold of interest. The spectrum of the EDJM is found to be highly correlated with the complexity of the le...


Matrix Neural Networks

Traditional neural networks assume vectorial inputs as the network is arranged as layers of single line of computing units called neurons. This special structure requires the non-vectorial inputs such as matrices to be converted into vectors. This process can be problematic. Firstly, the spatial information among elements of the data may be lost during vectorisation. Secondly, the solution spac...


Processor Allocation for Matrix

In this paper, we present the problem of allocating processors for matrix products. First, we consider how many processors should be allocated for computing one matrix product on a parallel system. Then, we discuss how to allocate processors for a number of independent matrix products on the parallel system. In many cases, it is shown that the performance of parallel algorithms does not improve...


Implementation of a programmable neuron in CNTFET technology for low-power neural networks

Circuit-level implementation of a novel neuron has been discussed in this article. A low-power Activation Function (AF) circuit is introduced in this paper, which is then combined with a highly linear synapse circuit to form the neuron architecture. Designed in Carbon Nanotube Field-Effect Transistor (CNTFET) technology, the proposed structure consumes low power, which makes it suitable for the...


A Random Matrix Approach to Neural Networks

…ℝ^{n×p} is a matrix of independent zero-mean, unit-variance entries, and σ: ℝ → ℝ is a Lipschitz continuous (activation) function, σ(WX) being understood entry-wise. We prove that, as n, p, T grow large at the same rate, the resolvent Q = (G + γI_T)^{−1}, for γ > 0, has a similar behavior to that met in sample covariance matrix models, involving notably the moment Φ = (T/n) E[G], which provides in pas...

متن کامل


Journal

Journal Title: ACM Transactions on Embedded Computing Systems

سال: 2022

ISSN: 1539-9087, 1558-3465

DOI: https://doi.org/10.1145/3524453